Algorithms for Computing Variants of the Longest Common Subsequence Problem

Authors

  • Mohammad Sohel Rahman
  • Costas S. Iliopoulos
Abstract

The longest common subsequence (LCS) problem is one of the classical and well-studied problems in computer science. The computation of the LCS is a frequent task in DNA sequence analysis, and has applications to genetics and molecular biology. In this paper we introduce new variants of the LCS problem and present efficient algorithms to solve them. In particular, we introduce the notion of gap constraints in the LCS problem. For the LCS problem with fixed gap, we first present a naive algorithm that runs in $O(n^2 + R(K+1)^2)$ time, where $R$ is the total number of ordered pairs of positions at which the two strings match and $K$ is the fixed gap constraint. We then improve the running time to $O(n^2 + RK + R \log\log n)$ using some novel techniques. Furthermore, we present an algorithm that is independent of $K$ and runs in $O(n^2 + R \log\log n)$ time. Using these techniques, we also present a new $O(n^2)$ algorithm to solve the original LCS problem. Additionally, we modify our algorithms to handle elastic and rigid gaps. We also apply the notion of rigidness to the original LCS problem and modify the traditional dynamic programming solution to handle rigidness, giving an $O(n^2)$ algorithm for the problem. Finally, we improve the solution to Rigid Fixed Gap LCS to $O(n^2)$. In all of the above cases we assume that the two given strings have equal length $n$, but our results extend easily to two strings of different lengths. © 2008 Elsevier B.V. All rights reserved.
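
The naive fixed-gap bound quoted above can be illustrated with a short sketch. It is not the authors' code: the abstract does not spell out the gap constraint, so the sketch assumes that consecutive matched positions in each string may be at most K + 1 apart (at most K skipped characters). The function name `fig_lcs_length` and the table `best` are hypothetical names chosen for illustration.

```python
def fig_lcs_length(x: str, y: str, k: int) -> int:
    """Length of a longest common subsequence of x and y under an
    ASSUMED fixed-gap rule: consecutive matched positions are at most
    k + 1 apart in both strings."""
    n, m = len(x), len(y)
    # best[i][j] = length of the longest gap-constrained common
    # subsequence whose last matched pair is (i, j); 0 if x[i] != y[j].
    best = [[0] * m for _ in range(n)]
    result = 0
    for i in range(n):
        for j in range(m):
            if x[i] != y[j]:
                continue  # only matching pairs can end a subsequence
            longest_prev = 0
            # Scan the (k+1) x (k+1) window of admissible predecessors.
            for pi in range(max(0, i - k - 1), i):
                for pj in range(max(0, j - k - 1), j):
                    longest_prev = max(longest_prev, best[pi][pj])
            best[i][j] = longest_prev + 1
            result = max(result, best[i][j])
    return result
```

Scanning all pairs costs O(n^2) for equal-length strings, and each of the R matching pairs inspects a window of at most (K + 1)^2 cells, which is where a bound of the form O(n^2 + R(K + 1)^2) comes from. With K at least as large as the string lengths the window covers every earlier pair, so the sketch reduces to the classical LCS length.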

Similar resources

Algorithms for Computing the Longest Parameterized Common Subsequence

In this paper, we revisit the classic and well-studied longest common subsequence (LCS) problem and study some new variants, first introduced and studied by Rahman and Iliopoulos [Algorithms for Computing Variants of the Longest Common Subsequence Problem, ISAAC 2006]. Here we define a generalization of these variants, the longest parameterized common subsequence (LPCS) problem, and show how to...

Algorithms for Computing Variants of the Longest Common Subsequence Problem (Extended

The longest common subsequence (LCS) problem is one of the classical and well-studied problems in computer science. The computation of the LCS is a frequent task in DNA sequence analysis, and has applications to genetics and molecular biology. In this paper we define new variants, introducing the notion of gap constraints in the LCS problem, and present efficient algorithms to solve them. The new vari...

Constrained Longest Common Subsequence Computing Algorithms in Practice

The problem of finding a constrained longest common subsequence (CLCS) for the sequences A and B with respect to the sequence P was introduced recently. Its goal is to find a longest subsequence C of A and B such that P is a subsequence of C. There are several algorithms solving the CLCS problem, but there is no real experimental comparison of them. The paper has two aims. Firstly, we propose a...

On the Constrained Longest Common Subsequence Problem

The problem of the longest common subsequence is a classical distance measure for strings. There have been several attempts to accommodate longest common subsequences along with some other distance measures. There are a large number of different variants of the problem. In this paper, we consider the constrained longest common subsequence problem for two strings and an arbitrary number of constrai...

New Algorithms for the Longest Common Subsequence Problem

Given two sequences $A = a_1 a_2 \cdots a_m$ and $B = b_1 b_2 \cdots b_n$, $m \le n$, over some alphabet $\Sigma$, a common subsequence $C = c_1 c_2 \cdots c_l$ of $A$ and $B$ is a sequence that can be obtained from both $A$ and $B$ by deleting zero or more (not necessarily adjacent) symbols. Finding a common subsequence of maximal length is called the Longest Common Subsequence (LCS) Problem. Two new algorithms based on the wel...

A Load Balancing Technique for Some Coarse-Grained Multicomputer Algorithms

The paper presents a load balancing method for some CGM (Coarse-Grained Multicomputer) algorithms. This method can be applied to different dynamic programming problems, such as Longest Increasing Subsequence, Longest Common Subsequence, Longest Repeated Suffix Ending at each point in a word, and Detection of Repetitions. We also present experimental results showing that our method is efficient.

Journal title:
  • Theor. Comput. Sci.

Volume 395

Publication date 2006